Genomic Distance with DCJ and Indels
Abstract
The double cut and join (DCJ) operation, introduced by Yancopoulos, Attie and Friedberg in 2005, allows one to represent most rearrangement events in genomes. However, a DCJ cannot perform an insertion or a deletion and most approaches under this model consider only genomes with the same content and without duplications, including the linear time algorithms to compute the DCJ distance and to find an optimal DCJ sorting sequence. In this work, we compare two genomes with unequal content, but still without duplications, and present a new linear time algorithm to compute the genomic distance, considering DCJ and indel operations. With this method we find preliminary evidence of the occurrence of clusters of deletions in the Rickettsia bacterium.
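For orientation, the equal-content case mentioned in the abstract can be sketched in a few lines. The block below is not the indel-aware algorithm of this work. It only illustrates the standard adjacency-graph formula d = N - (C + I/2) for two genomes with the same genes and no duplications, where N is the number of genes, C the number of cycles and I the number of odd paths. The genome encoding (a list of `(signed_gene_list, is_circular)` pairs) and the function names are assumptions chosen for this illustration.

```python
def adjacency_partners(genome):
    """Map each gene extremity to its neighbour in the genome (None at a telomere).

    `genome` is a list of (genes, is_circular) pairs; `genes` is a non-empty
    list of signed integers. This encoding is an assumption of the sketch.
    """
    partner = {}
    for genes, circular in genome:
        # Each gene contributes (left extremity, right extremity) in reading order.
        ends = []
        for g in genes:
            tail, head = (abs(g), "t"), (abs(g), "h")
            ends.append((tail, head) if g > 0 else (head, tail))
        # Consecutive genes share an adjacency.
        for (_, right), (left, _) in zip(ends, ends[1:]):
            partner[right], partner[left] = left, right
        first, last = ends[0][0], ends[-1][1]
        if circular:
            partner[first], partner[last] = last, first
        else:
            partner[first] = partner[last] = None  # telomeres
    return partner


def dcj_distance(genome_a, genome_b):
    """DCJ distance N - (C + I/2) for genomes with equal content, no duplicates."""
    pa, pb = adjacency_partners(genome_a), adjacency_partners(genome_b)
    if pa.keys() != pb.keys():
        raise ValueError("this sketch only handles genomes with equal gene content")

    seen = set()

    def walk(e, cur, other):
        # Follow edges of the adjacency graph, alternating between genomes.
        # Traversing edge `e` lands on the vertex of genome `cur` containing e.
        edges = 0
        while e is not None and e not in seen:
            seen.add(e)
            edges += 1
            e = cur[e]
            cur, other = other, cur
        return edges

    # Paths must be walked from their telomeres before counting cycles.
    odd_paths = 0
    for t in [e for e, p in pa.items() if p is None]:
        if t not in seen and walk(t, pb, pa) % 2 == 1:
            odd_paths += 1
    for t in [e for e, p in pb.items() if p is None]:
        if t not in seen:
            walk(t, pa, pb)  # remaining paths join two telomeres of B (even)

    # Every extremity still unvisited lies on a cycle.
    cycles = 0
    for e in pa:
        if e not in seen:
            walk(e, pb, pa)
            cycles += 1

    n_genes = len(pa) // 2
    return n_genes - cycles - odd_paths // 2


if __name__ == "__main__":
    a = [([1, 2, 3], True)]
    b = [([1, -2, 3], True)]
    print(dcj_distance(a, b))  # a single inversion separates the two genomes
```

Each extremity is visited once, so the sketch runs in time linear in the number of genes. Handling unequal content, as this work does, requires additional bookkeeping that is not shown here.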
Similar resources
Exemplar or Matching: Modeling DCJ Problems with Unequal Content Genome Data
The edit distance under the DCJ model can be computed in linear time for genomes with equal content or with indels, but it becomes NP-hard in the presence of duplications, a problem that remains largely unsolved, especially when indels are considered. In this paper, we compare two mainstream methods to deal with duplications and associate them with indels: one by deletion, namely DCJ-Indel-Exemplar distance;...
Algorithms for sorting unsigned linear genomes by the DCJ operations
Motivation: The double cut and join operation (abbreviated as DCJ) has been extensively used for genomic rearrangement. Although the DCJ distance between signed genomes with both linear and circular (uni- and multi-) chromosomes is well studied, the only known result for the NP-complete unsigned DCJ distance problem is an approximation algorithm for unsigned linear unichromosomal genomes. In thi...
UniMoG—a unifying framework for genomic distance calculation and sorting based on DCJ
Summary: UniMoG is a software combining five genome rearrangement models: double cut and join (DCJ), restricted DCJ, Hannenhalli and Pevzner (HP), inversion and translocation. It can compute the pairwise genomic distances and a corresponding optimal sorting scenario for an arbitrary number of genomes. All five models can be unified through the DCJ model, thus the implementation is based on DCJ a...
Implicit Transpositions in DCJ Scenarios
Genome rearrangements are large-scale evolutionary events that shuffle genomic architectures. The minimal number of such events between two genomes is often used in phylogenomic studies to measure the evolutionary distance between the genomes. Double-Cut-and-Join (DCJ) operations represent a convenient model of most common genome rearrangements (reversals, translocations, fissions, and fusions)...
Estimating true evolutionary distances under the DCJ model
Motivation: Modern techniques can yield the ordering and strandedness of genes on each chromosome of a genome; such data already exists for hundreds of organisms. The evolutionary mechanisms through which the set of the genes of an organism is altered and reordered are of great interest to systematists, evolutionary biologists, comparative genomicists and biomedical researchers. Perhaps the most...